Are audio or textual training data more important for ASR in less-represented languages?
Authors
Abstract
State-of-the-art speech recognizers are typically trained on very large amounts of data, both transcribed speech and texts. With the recent growing interest in developing speech technologies for languages for which only small amounts of data are accessible, collecting appropriate data is a key issue in building new speech recognition systems. This article reports on an experimental study assessing the performance of a speech recognizer for a less-represented language, as a function of the quantity of texts and transcribed speech data available for model training. The experimental results show that for supervised training with only 2 hours of manually transcribed data, the acoustic models are the weak point. With 10 hours or more of transcribed audio data, the quantity of texts has a larger effect on the error rate than the quantity of speech.
Similar resources
Exploring Novice Raters’ Textual Considerations in Independent and Negotiated Ratings
Educators often employ various training techniques to reduce raters’ subjectivity. Negotiation is a technique which can assist novice raters to co-construct a shared understanding of the writing assessment when rating collaboratively. There is little research, however, on rating behaviors of novice raters while employing negotiation techniques and the effect of negotiation on their understandin...
Speech alignment and recognition experiments for Luxembourgish
Luxembourgish, embedded in a multilingual context on the divide between Romance and Germanic cultures, remains one of Europe’s under-described languages. In this paper, we propose to study acoustic similarities between Luxembourgish and major contact languages (German, French, English) with the help of automatic speech alignment and recognition systems. Experiments were run using monolingual ac...
Comparison of different machine learning methods in extractive speech-to-speech summarization of Persian without using transcripts
In this paper, extractive speech summarization using different machine learning algorithms was investigated. The task of speech summarization deals with extracting important and salient segments from speech so that speech files can be accessed, searched, and browsed more easily and at lower cost. In this paper, a new method for speech summarization without using automatic speech recognitio...
Wiki-like Editing of Imperfect Computer-Generated Webcast Transcripts
As the use of Internet broadcasting (webcasting) increases, more webcasts will be archived and accessed numerous times retrospectively. One challenge in skimming and browsing through such archives is the lack of textual transcripts of the archived media’s audio channel. Ideally, transcripts would be obtainable through Automatic Speech Recognition (ASR). However, current ASR systems can only del...
Automatic transcription of Somali language
Most African countries rely on oral tradition to transmit their cultural, scientific and historical heritage across generations. This ancestral knowledge, accumulated over centuries, is today in danger of disappearing. Automatic transcription and indexing tools seem a potential solution for preserving it. This paper presents the first steps of automatic speech recognition (ASR) of Djibouti ...